Does Category A Anchor Text Improve Category B Results?

Author

  • Leonid Boytsov
Abstract

Associating anchor text with the pages that links point to is a well-known approach to improving retrieval quality; it was used in the first version of Google [Brin and Page 1998]. On the one hand, anchor text alone is enough to build a system with decent performance [Anh and Moffat 2010; Hiemstra and Hauff 2010], and our own experiments in TREC 2011 also showed that anchor text is a strong relevance signal [Boytsov and Belova 2011]. On the other hand, the anchor text is much smaller than the full text of the collection. Thus, enriching the Category B index (built over 50M documents) with the Category A anchor text index (built over 370M short documents) seemed to be an appealing method of improving performance at little cost.

Similar resources

Using Anchor Text, Spam Filtering and Wikipedia for Web Search and Entity Ranking

In this paper, we document our efforts in participating in the TREC 2010 Entity Ranking and Web Tracks. We had multiple aims: for the Web Track, we wanted to compare the effectiveness of anchor text of the category A and B collections and the impact of global document quality measures such as PageRank and spam scores. For the Entity Ranking Track, we use Wikipedia as a pivot to find relevant ent...

Selecting the Right Cause from the Right Category: Does the Role of Product Category Matter in Cause-Brand Alliance? A Case Study of Students in Shanghai Universities

Increased competition is making it difficult to distinguish products solely by attributes, creating room for cause-related marketing. In this study with a sample of 322 university students, we evaluated the changes in consumer attitudes toward cause and brand as consequences of Cause Brand Alliance (CBA), by using the product category as moderator. Four popular brands from two product categorie...

Using anchor text for homepage and topic distillation search tasks

Past work suggests that anchor text is a good source of evidence that can be used to improve web searching. Two approaches for making use of this evidence include fusing search results from an anchor text representation and the original text representation based on a document’s relevance score or rank position, and combining term frequency from both representations during the retrieval process....

Does routine repeat testing of critical laboratory values improve their accuracy?

Background: Routine repeat testing of critical laboratory values is now very common, with the aim of increasing their accuracy and avoiding the reporting of false or infeasible results. We investigate whether repeat testing of critical laboratory values has any benefit. Methods: We examined 2233 repeated critical laboratory values in 13 different hematology and chemistry tests including: hemoglobin, white...

Ad Hoc and Diversity Retrieval at the University of Delaware

We indexed ClueWeb using the Indri retrieval engine [6]. Due to disk space constraints, we elected to use the Category B subset of 50 million English-language web pages only. We indexed the full documents. We included field information such as title, headings, and bold/italic markup, and dropped script and style tags. We did not index anchor text. We used the Krovetz stemmer and a simple stopwo...

Publication date: 2012